Eng Siong Chng

Stream-Voice-Anon: Enhancing Utility of Real-Time Speaker Anonymization via Neural Audio Codec and Language Models

Jan 20, 2026
Via arXiv

SLAM-LLM: A Modular, Open-Source Multimodal Large Language Model Framework and Best Practice for Speech, Language, Audio and Music Processing

Jan 14, 2026
Via arXiv

Improving Code-Switching Speech Recognition with TTS Data Augmentation

Jan 02, 2026
Via arXiv

DepFlow: Disentangled Speech Generation to Mitigate Semantic Bias in Depression Detection

Jan 01, 2026
Via arXiv

GenTSE: Enhancing Target Speaker Extraction via a Coarse-to-Fine Generative Language Model

Dec 24, 2025
Via arXiv

Next-Frame Feature Prediction for Multimodal Deepfake Detection and Temporal Localization

Nov 13, 2025
Via arXiv

Omni-Captioner: Data Pipeline, Models, and Benchmark for Omni Detailed Perception

Oct 14, 2025
Via arXiv

Mind-Paced Speaking: A Dual-Brain Approach to Real-Time Reasoning in Spoken Language Models

Oct 10, 2025
Via arXiv

Improving Synthetic Data Training for Contextual Biasing Models with a Keyword-Aware Cost Function

Sep 11, 2025
Via arXiv

Zero-shot Context Biasing with Trie-based Decoding using Synthetic Multi-Pronunciation

Aug 25, 2025
Via arXiv